A Hybrid Approach for Co-Channel Speech Segregation based on CASA, HMM Multipitch Tracking, and Medium Frame Harmonic Model
نویسندگان
چکیده
This paper proposes a hybrid approach for cochannel speech segregation. HMM (hidden Markov model) is used to track the pitches of 2 talkers. The resulting pitch tracks are then enriched with the prominent pitch. The enriched tracks are correctly grouped using pitch continuity. Medium frame harmonics are used to extract the second pitch for frames with only one pitch deduced using the previous steps. Finally, the pitch tracks are input to CASA (computational auditory scene analysis) to segregate the mixed speech. The center frequency range of the gamma tone filter banks is maximized to reduce the overlap between the channels filtered for better segregation. Experiments were conducted using this hybrid approach on the speech separation challenge database and compared to the single (non-hybrid) approaches, i.e. signal processing and CASA. Results show that using the hybrid approach outperforms the single approaches. Keywords—CASA (computational auditory scene analysis); cochannel speech segregation; HMM (hidden Markov model) tracking; hybrid speech segregation approach; medium frame harmonic model; multipitch tracking, prominent pitch.
منابع مشابه
Monaural Voiced Speech Separation with Multipitch Tracking
Separating voiced speech from its mixtures with interferences in monaural condition is not only an important but also challenging task. As multipitch tracking can enable much better performance of speech separation for CASA systems, we propose a new multipitch determination algorithm, which can be used under various kinds of noise conditions. In the process of multipitch estimation, a new repre...
متن کاملMultipitch tracking using a factorial hidden Markov model
In this paper, we present an approach to track the pitch of two simultaneous speakers. Using a well-known feature extraction method based on the correlogram, we track the resulting data using a factorial hidden Markov model (FHMM). In contrast to the recently developed multipitch determination algorithm [1], which is based on a HMM, we can accurately associate estimated pitch points with their ...
متن کاملSpeech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on the independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a Maximum a posterior (MAP) estimator based on Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...
متن کاملInterfacing Sound Stream Segregation to Automatic Speech Recognition - Preliminary Results on Listening to Several Sounds Simultaneously
This paper reports the preliminary results of experiments on listening to several sounds at once. Two issues are addressed: segregating speech streams from a mixture of sounds, and interfacing speech stream segregation with automatic speech recognition (ASR). Speech stream segregation (SSS) is modeled as a process of extracting harmonic fragments, grouping these extracted harmonic fragments, an...
متن کاملMultipitch Tracking for Noisy and Reverberant Speech
Abstract – Multipitch tracking in real environments is critical for speech signal processing. Determining pitch in reverberant and noisy speech is a particularly challenging task. In this paper, we propose a robust algorithm for multipitch tracking in the presence of both background noise and room reverberation. An auditory front-end and a new channel selection method are utilized to extract pe...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1312.4127 شماره
صفحات -
تاریخ انتشار 2013